57 research outputs found

Subfamily specific conservation profiles for proteins based on n-gram patterns

Author: F Fogolari
GP Raghava
H Joe
H W.
I Bahar
JC Wootton
JE Coronado
JK Vries
JK Vries
John K Vries
MO Dayhoff
MS Johnson
PC Mahalanobis
QW Dong
R Karchin
RD Finn
S Henikoff
S Henikoff
SF Altschul
SF Altschul
WS Valdar
WS Valdar
WS Valdar
Xiong Liu
Y Hou
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: A new algorithm has been developed for generating conservation profiles that reflect the evolutionary history of the subfamily associated with a query sequence. It is based on n-gram patterns (NP{n,m}), which are sets of n residues and m wildcards in windows of size n+m. The generation of conservation profiles is treated as a signal-to-noise problem where the signal is the count of n-gram patterns in target sequences that are similar to the query sequence and the noise is the count over all target sequences. The signal is differentiated from the noise by applying singular value decomposition to sets of target sequences rank ordered by similarity with respect to the query. Results: The new algorithm was used to construct 4,248 profiles from 120 randomly selected Pfam-A families. These were compared to profiles generated from multiple alignments using the consensus approach. The two profiles were similar whenever the subfamily associated with the query sequence was well represented in the multiple alignment. It was possible to construct subfamily-specific conservation profiles using the new algorithm for subfamilies with as few as five members. The speed of the new algorithm was comparable to the multiple alignment approach. Conclusion: Subfamily-specific conservation profiles can be generated by the new algorithm without a priori knowledge of family relationships or domain architecture. This is useful when the subfamily contains multiple domains with different levels of representation in protein databases. It may also be applicable when the subfamily sample size is too small for the multiple alignment approach.
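
The pattern definition in this abstract (n residues and m wildcards in a window of size n+m) can be illustrated with a minimal Python sketch. The wildcard symbol, the function names, and the choice to allow a wildcard at any window position are assumptions made for illustration; the abstract does not specify them, and the similarity ranking and singular value decomposition steps are not shown.

```python
from collections import Counter
from itertools import combinations

WILDCARD = "."  # assumed symbol; the abstract does not name one


def ngram_patterns(seq, n, m):
    """Yield every pattern with n residues and m wildcards taken from
    each window of size n + m in seq (hypothetical helper)."""
    width = n + m
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        # Choose which m positions of the window become wildcards.
        for masked in combinations(range(width), m):
            yield "".join(
                WILDCARD if i in masked else residue
                for i, residue in enumerate(window)
            )


def pattern_counts(seqs, n, m):
    """Count pattern occurrences over a collection of sequences."""
    counts = Counter()
    for seq in seqs:
        counts.update(ngram_patterns(seq, n, m))
    return counts


if __name__ == "__main__":
    # Toy sequences, not real protein data.
    for pattern, count in sorted(pattern_counts(["ACDA", "ACD"], 2, 1).items()):
        print(pattern, count)
```

Under the signal-to-noise framing of the abstract, counts of this kind would be taken once over the target sequences similar to the query (signal) and once over all target sequences (noise).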

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Incorporating background frequency improves entropy-based residue conservation measures

Author: B Schuster-Bockler
C Sander
CH Wu
CT Porter
CT Workman
D La
E Bindewald
G Cheng
GD Stormo
GE Crooks
H Yao
I Mihalek
J Pei
J Pei
J Pei
JD Watson
JM Johnson
JP Bielawski
K Sjolander
K Wang
Kai Wang
KW Plaxco
L Oliveira
LA Mirny
LA Mirny
M Clamp
M Gerstein
M Landau
O Lichtarge
OS Soyer
PC Ng
PS Shenkin
R Greaves
Ram Samudrala
RB Vilim
RM Williamson
S Jones
S Levy
SF Altschul
SR Eddy
SR Sunyaev
SS Hannenhalli
TM Cover
V Chelliah
WS Valdar
WS Valdar
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Several entropy-based methods have been developed for scoring sequence conservation in protein multiple sequence alignments. High scoring amino acid positions may correlate with structurally or functionally important residues. However, amino acid background frequencies are usually not taken into account in these entropy-based scoring schemes. RESULTS: We demonstrate that using a relative entropy measure that incorporates amino acid background frequency results in improved performance in identifying functional sites from protein multiple sequence alignments. CONCLUSION: Our results suggest that the application of appropriate background frequency information may lead to more biologically relevant results in many areas of bioinformatics.
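
A relative entropy score of the kind this abstract refers to can be sketched as the Kullback-Leibler divergence between the residue distribution observed in an alignment column and a background distribution. The sketch below is only an illustration: the function name, the pseudocount, the treatment of gaps, and the toy background frequencies are assumptions, not details taken from the paper.

```python
import math
from collections import Counter


def relative_entropy(column, background, pseudocount=1e-6):
    """Relative entropy (in bits) of an alignment column against a
    background distribution. Symbols absent from `background`, such as
    gaps, are ignored (an assumption of this sketch)."""
    residues = [c for c in column if c in background]
    if not residues:
        return 0.0
    counts = Counter(residues)
    total = len(residues) + pseudocount * len(background)
    score = 0.0
    for aa, q in background.items():
        # Pseudocount keeps p > 0 so the logarithm is always defined.
        p = (counts.get(aa, 0) + pseudocount) / total
        score += p * math.log2(p / q)
    return score


if __name__ == "__main__":
    # Toy four-letter background; a real one would cover all 20 amino acids.
    toy_background = {"A": 0.7, "W": 0.1, "C": 0.1, "D": 0.1}
    print(relative_entropy("WWWW", toy_background))
    print(relative_entropy("AAAA", toy_background))
```

With a non-uniform background, a column conserved for a rare residue scores higher than one conserved for a common residue, which is the effect that plain entropy scores cannot capture.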

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Exploring Protein-Protein Interactions as Drug Targets for Anti-cancer Therapy with In Silico Workflows

Author: A Goncearenco
A Goncearenco
A Marchler-Bauer
A Truszkowski
AA Bogan
B Graves
B Ma
BA Shoemaker
BA Shoemaker
BA Shoemaker
BJ Smith
CA Goble
CM Yates
D Petrey
E Cukuroglu
FP Davis
H Perez-Sanchez
HS Haase
J Bhagat
J Cinatl
JA Wells
K Wolstencroft
M Guharoy
M Li
M Li
M Li
M Petukh
M Tyagi
MK Gilson
MP Mazanetz
N Estrada-Ortiz
P Aloy
P Aloy
P Filippakopoulos
R Mosca
RR Thangudu
S Beisken
S Kim
S Shangary
S Teng
T Rolland
W Yang
WS Valdar
Y Wang
Y Zhao
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

We describe a computational protocol to aid the design of small molecule and peptide drugs that target protein-protein interactions, particularly for anti-cancer therapy. To achieve this goal, we explore multiple strategies, including finding binding hot spots, incorporating chemical similarity and bioactivity data, and sampling similar binding sites from homologous protein complexes. We demonstrate how to combine existing interdisciplinary resources with examples of semi-automated workflows. Finally, we discuss several major problems, including the occurrence of drug-resistant mutations, drug promiscuity, and the design of dual-effect inhibitors.

Affiliation: Goncearenco, Alexander. National Institutes of Health; United States
Affiliation: Li, Minghui. Soochow University; China. National Institutes of Health; United States
Affiliation: Simonetti, Franco Lucio. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Instituto de Investigaciones Bioquímicas de Buenos Aires. Fundación Instituto Leloir. Instituto de Investigaciones Bioquímicas de Buenos Aires; Argentina
Affiliation: Shoemaker, Benjamin A. National Institutes of Health; United States
Affiliation: Panchenko, Anna R. National Institutes of Health; United States

Crossref

CONICET Digital

Combination of scoring schemes for protein docking

Author: B Huang
C Zhang
CM Deane
D Kozakov
Dietmar Schomburg
F Melo
G Moont
H Neuvirth
I Halperin
IN Shindyalov
J Mintseris
JE Dennis
KE Gottschalk
L Lo Conte
M Meyer
O Martin
O Zimmermann
P Aloy
P Caffrey
P Chakrabarti
P Heuser
Philipp Heuser
R Development Core Team
RB Schnabel Koontz J.
RM Jackson
S Jones
V Grimm
WS Valdar
Publication venue: BioMed Central
Publication date: 01/08/2007

Background: Docking algorithms are developed to predict the orientation in which two proteins are likely to bind under natural conditions. The currently used methods usually consist of a sampling step followed by a scoring step. We developed a weighted geometric correlation based on optimised atom-specific weighting factors and combined it with our previously published amino acid-specific scoring and with a comprehensive SVM-based scoring function. Results: The scoring with the atom-specific weighting factors yields better results than the amino acid-specific scoring. In combination with SVM-based scoring functions, the percentage of complexes for which a near-native structure can be predicted within the top 100 ranks increased from 14% with the geometric scoring to 54% with the combination of all scoring functions. The ranking results are especially good for the enzyme-inhibitor complexes. For half of these complexes a near-native structure can be predicted within the first 10 proposed structures, and for more than 86% of all enzyme-inhibitor complexes within the first 50 predicted structures. Conclusion: We were able to develop a combination of different scoring schemes which considers a series of previously described and some new scoring criteria, yielding a remarkable improvement of prediction quality.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Exploiting residue-level and profile-level interface propensities for usage in binding sites prediction of proteins

Author: A Dubey
A Koike
A Rossi
AH Liu
AJ Bordner
AJ Bordner
AR Panchenko
AT Laurie
B Pils
B Thibert
B Wang
B Wilczynski
C Sander
C Yan
C Yan
C Zhang
CC Chang
D La
DH Morgan
F Osterberg
G Cheng
H Chen
H Deng
H Neuvirth
H Yao
H Yao
HX Zhou
I Res
I Xenarios
IM Nooren
IM Nooren
J Meiler
JL Chung
JR Bradford
JR Bradford
JW Torrance
K Henrick
KA Snyder
L Lo Conte
Lei Lin
MH Li
O Lichtarge
P Chakrabarti
Q Dong
Qiwen Dong
Qw Dong
QW Dong
S Jones
S Karlin
S Liang
SF Altschul
T Down
TJ Magliery
V Chelliah
VN Vapnik
W Kabsch
WS Valdar
WS Valdar
Xiaolong Wang
Y Kim
Y Ofran
Y Ofran
Yi Guan
Z Zhang
Publication venue: BioMed Central
Publication date: 01/05/2007

Background: Recognition of binding sites in proteins is a direct computational approach to the characterization of proteins in terms of biological and biochemical function. Residue preferences have been widely used in many studies but the results are often not satisfactory. Although different amino acid compositions among the interaction sites of different complexes have been observed, such differences have not been integrated into the prediction process. Furthermore, evolutionary information has not been exploited to achieve a more powerful propensity. Results: In this study, the residue interface propensities of four kinds of complexes (homo-permanent complexes, homo-transient complexes, hetero-permanent complexes and hetero-transient complexes) are investigated. These propensities, combined with sequence profiles and accessible surface areas, are used as input to a support vector machine for the prediction of protein binding sites. Such propensities are further improved by taking evolutionary information into consideration, which results in a class of novel propensities at the profile level, i.e. the binary profile interface propensities. Experiments were performed on 1139 non-redundant protein chains. Although different residue interface propensities among different complexes are observed, adding residue interface propensities to the classifier yields only a negligible improvement over the classifier without them. The binary profile interface propensities can significantly improve the performance of binding site prediction, by about ten percent in terms of both precision and recall. Conclusion: Although there are minor differences among the four kinds of complexes, the residue interface propensities cannot provide efficient discrimination for the complicated interfaces of proteins. The binary profile interface propensities can significantly improve the performance of protein binding site prediction, which indicates that the propensities at the profile level are more accurate than those at the residue level.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

State of the art: refinement of multiple sequence alignments

Author: A Marchler-Bauer
AB Robinson
AJ Jennings
Anna R Panchenko
C Notredame
C Notredame
CB Do
Christopher J Lanczycki
GJ Barton
IM Wallace
J Chen
J Heringa
J Heringa
JD Thompson
JD Thompson
JD Thompson
JD Thompson
JD Thompson
JD Thompson
JD Thompson
JF Gibrat
K Katoh
K Katoh
O Gotoh
Paul A Thiessen
RC Edgar
S Chakrabarti
Saikat Chakrabarti
SR Eddy
Stephen H Bryant
T Lassmann
T Madej
Teresa M Przytycka
WR Taylor
WS Valdar
Y Wang
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Accurate multiple sequence alignments of proteins are very important in computational biology today. Despite the numerous efforts made in this field, all alignment strategies have certain shortcomings resulting in alignments that are not always correct. Refinement of existing alignments can prove to be an intelligent choice considering the increasing importance of high quality alignments in large scale high-throughput analysis. RESULTS: We provide an extensive comparison of the performance of the alignment refinement algorithms. The accuracy and efficiency of the refinement programs are compared using the 3D structure-based alignments in the BAliBASE benchmark database as well as manually curated high quality alignments from the Conserved Domain Database (CDD). CONCLUSION: Comparison of performance for refined alignments revealed that, despite the absence of dramatic improvements, our refinement method, REFINER, which uses conserved regions as constraints, performs better in improving the alignments generated by different alignment algorithms. In most cases REFINER produces a higher-scoring, modestly improved alignment that does not deteriorate the well-conserved regions of the original alignment.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Automatic prediction of catalytic residues by modeling residue structural neighborhood

Author: A Ceroni
A Humm
A Yamaguchi
AC Wallace
AE Todd
Andrea Passerini
CT Porter
E Chea
E Webb
E Youn
EF Pettersen
Elisa Cilia
G Amitai
G Bartlett
J Bernardes
J Davis
J Ebert
J Mistry
JA Capra
JC Nebel
JD Fischer
KM Borgwardt
L Xie
M Babor
M Lippi
M Ondrechen
MM Benning
N Cristianini
N Nagano
N Shu
NV Petrova
P Gherardini
RD Finn
S Kawashima
SF Altschul
T Joachims
T Zhang
W Tong
WS Valdar
Y Tang
Y Wei
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Prediction of catalytic residues is a major step in characterizing the function of enzymes. In its simpler formulation, the problem can be cast into a binary classification task at the residue level, by predicting whether the residue is directly involved in the catalytic process. The task is hard even when structural information is available, due to the rather wide range of roles a functional residue can play and to the large imbalance between the number of catalytic and non-catalytic residues. Results: We developed an effective representation of structural information by modeling spherical regions around candidate residues, and extracting statistics on the properties of their content such as physico-chemical properties, atomic density, flexibility, and presence of water molecules. We trained an SVM classifier combining our features with sequence-based information and previously developed 3D features, and compared its performance with the most recent state-of-the-art approaches on different benchmark datasets. We further analyzed the discriminant power of the information provided by the presence of heterogens in the residue neighborhood. Conclusions: Our structure-based method achieves consistent improvements on all tested datasets over both sequence-based and structure-based state-of-the-art approaches. Structural neighborhood information is shown to be responsible for such results, and predicting the presence of nearby heterogens seems to be a promising direction for further improvements.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

DI-fusion

Multidimensional Scaling Reveals the Main Evolutionary Pathways of Class A G-Protein-Coupled Receptors

Author: A Gogos
A Rokas
A Rokas
B Wu
C Kuiken
C Tuffley
David Thybert
DE Gloriam
DG Higgins
DJ Smith
DK Vassilatis
E Susko
F Lu
F Murtagh
G Blackshields
G Casari
H Abdi
H Abdi
H Rompler
HC Wang
Hervé Abdi
I Domazet
I Kass
IG Choi
J Devillé
J Hou
J Hou
J Tzeng
JC Gower
JH Park
JS Surgand
Julien Pelé
JW DeLano
K Palczewski
K Ye
KB Nicholas
KJ Woolley
Kolakowski LF Jr
M Anctil
M Greenacre
M Nei
MA Larkin
Marie Chabbert
Matthieu Moreau
MC Peeters
MW Trosset
P Rousseeuw
P Scheerer
R Fredriksson
R Fredriksson
RA Studer
RP Metpally
S Yohannan
S Yohannan
SC Sealfon
SG Rasmussen
T Haitina
U Gether
V Cherezov
Vladimir N. Uversky
VN Grishin
W Shi
WM Fitch
WS Togerson
WS Valdar
Y Takane
Publication venue: Public Library of Science
Publication date: 01/01/2011

Class A G-protein-coupled receptors (GPCRs) constitute the largest family of transmembrane receptors in the human genome. Understanding the mechanisms which drove the evolution of such a large family would help understand the specificity of each GPCR sub-family with applications to drug design. To gain evolutionary information on class A GPCRs, we explored their sequence space by metric multidimensional scaling analysis (MDS). Three-dimensional mapping of human sequences shows a non-uniform distribution of GPCRs, organized in clusters that lie along four privileged directions. To interpret these directions, we projected supplementary sequences from different species onto the human space used as a reference. With this technique, we can easily monitor the evolutionary drift of several GPCR sub-families from cnidarians to humans. Results support a model of radiative evolution of class A GPCRs from a central node formed by peptide receptors. The privileged directions obtained from the MDS analysis are interpretable in terms of three main evolutionary pathways related to specific sequence determinants. The first pathway was initiated by a deletion in transmembrane helix 2 (TM2) and led to three sub-families by divergent evolution. The second pathway corresponds to the differentiation of the amine receptors. The third pathway corresponds to parallel evolution of several sub-families in relation with a covarion process involving proline residues in TM2 and TM5. As exemplified with GPCRs, the MDS projection technique is an important tool to compare orthologous sequence sets and to help decipher the mutational events that drove the evolution of protein families.

Public Library of Science (PLOS)

CiteSeerX

Crossref

Directory of Open Access Journals

HAL-Inserm

PubMed Central

Okina

Optimized Hydrophobic Interactions and Hydrogen Bonding at the Target-Ligand Interface Leads the Pathways of Drug-Designing

Author: A Bhinge
A Fernandez
A Gangjee
A Simeonov
AC Wallace
AK Ghosh
Akulapalli Sudhakar
AM Davis
Ashley Stanley
Ashok K. Varma
AW White
B Moza
BR Brooks
C Hansch
CM Venkatachalam
CU Kim
DL Mobley
E Krissinel
E Petsalaki
GR Desiraju
GR Desiraju
H Sun
L Choulier
Lumbani Yadav
M Von Itzstein
MA Larkin
N Joseph
NA Roberts
NR Taylor
O Hantschel
P Csermely
P Fontana
P Szuromi
R Maiti
RJ Russell
RK Kondru
RL DesJarlais
Rohan Patil
S Ohlson
S Schenone
S Verma
SA Lipton
SB Bhise
SB Qian
SG Dimagno
SJ Parsons
SK Panigrahi
SK Panigrahi
Sridhar Hannenhalli
Suranjana Das
T Blundell
WS Valdar
Y Lu
Publication venue: Public Library of Science
Publication date: 16/08/2010

Weak intermolecular interactions such as hydrogen bonding and hydrophobic interactions are key players in stabilizing energetically-favored ligands in an open conformational environment of protein structures. However, it is still poorly understood how the binding parameters associated with these interactions facilitate a drug-lead to recognize a specific target and improve drug efficacy. To understand this, hydrophobic interactions, hydrogen bonding and binding affinity were comprehensively analyzed at the interface of c-Src and c-Abl kinases and 4-amino substituted 1H-pyrazolo[3,4-d]pyrimidine compounds. In-silico docking studies were performed, using Discovery Studio software modules LigandFit, CDOCKER and ZDOCK, to investigate the role of ligand binding affinity at the hydrophobic pocket of c-Src and c-Abl kinase. Hydrophobic and hydrogen bonding interactions of docked molecules were compared using the LigPlot program. Furthermore, 3D-QSAR and MFA calculations were scrutinized to quantify the role of weak interactions in binding affinity and drug efficacy. The in-silico method has enabled us to reveal that a multi-targeted small molecule binds with low affinity to its respective targets, but its binding affinity can be altered by integrating the conformationally favored functional groups at the active site of the ligand-target interface. Docking studies of 4-amino-substituted molecules at the bioactive cascade of c-Src and c-Abl led us to conclude that 3D structural folding at the protein-ligand groove is also a hallmark for molecular recognition of multi-targeted compounds and for predicting their biological activity. The results presented here demonstrate that hydrogen bonding and optimized hydrophobic interactions both stabilize the ligands at the target site, and help alter binding affinity and drug efficacy.

Public Library of Science (PLOS)

Crossref

PubMed Central

Identification of human-to-human transmissibility factors in PB2 proteins of influenza A by large-scale mutual information analysis

Author: A Honda
A Muzzi
AJ Buckler-White
AM Khan
AP Kendal
AT Heiny
BT Korber
C Scholtissek
C Schonbach
CE Mills
CE Shannon
DL Wheeler
E Ghedin
E Poole
EK Subbarao
G Neumann
GW Chen
J Liu
J Mukaigawa
J Pei
J Thomas August
JC Obenauer
JK Taubenberger
JK Taubenberger
JM Chen
K Yusim
L Paninski
LC Martin
M Hatta
MA Nobrega
N Naffakh
O Miotto
O Miotto
Olivo Miotto
P Fechter
R Steuer
RA Gatenby
RC Edgar
RG Webster
S Stephens
SJ Baigent
The UniProt Consortium
Tin Wee Tan
TJ Lee
V Brusic
V Gregory
Vladimir Brusic
WS Valdar
YP Lin
Z Dawy
Publication venue: BioMed Central
Publication date: 01/01/2007

Background: The identification of mutations that confer unique properties to a pathogen, such as host range, is of fundamental importance in the fight against disease. This paper describes a novel method for identifying amino acid sites that distinguish specific sets of protein sequences, by comparative analysis of matched alignments. The use of mutual information to identify distinctive residues responsible for functional variants makes this approach highly suitable for analyzing large sets of sequences. To support mutual information analysis, we developed the AVANA software, which utilizes sequence annotations to select sets for comparison, according to user-specified criteria. The method presented was applied to an analysis of influenza A PB2 protein sequences, with the objective of identifying the components of adaptation to human-to-human transmission, and reconstructing the mutation history of these components. Results: We compared over 3,000 PB2 protein sequences of human-transmissible and avian isolates, to produce a catalogue of sites involved in adaptation to human-to-human transmission. This analysis identified 17 characteristic sites, five of which have been present in human-transmissible strains since the 1918 Spanish flu pandemic. Sixteen of these sites are located in functional domains, suggesting they may play functional roles in host-range specificity. The catalogue of characteristic sites was used to derive sequence signatures from historical isolates. These signatures, arranged in chronological order, reveal an evolutionary timeline for the adaptation of the PB2 protein to human hosts. Conclusion: By providing the most complete elucidation to date of the functional components participating in PB2 protein adaptation to humans, this study demonstrates that mutual information is a powerful tool for comparative characterization of sequence sets. In addition to confirming previously reported findings, several novel characteristic sites within PB2 are reported. Sequence signatures generated using the characteristic sites catalogue characterize concisely the adaptation characteristics of individual isolates. Evolutionary timelines derived from signatures of early human influenza isolates suggest that characteristic variants emerged rapidly, and remained remarkably stable through subsequent pandemics. In addition, the signatures of human-infecting H5N1 isolates suggest that this avian subtype has low pandemic potential at present, although it presents more human adaptation components than most avian subtypes.
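
The idea of using mutual information to find sites that distinguish two sequence sets can be sketched for a single alignment column as the mutual information between the residue observed and the set it came from. This is a minimal illustration under assumed conventions (the function name, the two-set labelling, and the use of bits are choices made here); it is not the AVANA implementation, whose details the abstract does not give.

```python
import math
from collections import Counter


def site_mutual_information(residues_a, residues_b):
    """Mutual information (in bits) between the residue at one aligned
    site and the set label ("A" or "B") of the sequence it came from."""
    joint = Counter()
    joint.update((r, "A") for r in residues_a)
    joint.update((r, "B") for r in residues_b)
    total = sum(joint.values())
    if total == 0:
        return 0.0
    residue_marginal = Counter()
    label_marginal = Counter()
    for (residue, label), count in joint.items():
        residue_marginal[residue] += count
        label_marginal[label] += count
    mi = 0.0
    for (residue, label), count in joint.items():
        p_joint = count / total
        p_residue = residue_marginal[residue] / total
        p_label = label_marginal[label] / total
        mi += p_joint * math.log2(p_joint / (p_residue * p_label))
    return mi


if __name__ == "__main__":
    # Toy columns, not real PB2 data: one residue per sequence at the same site.
    print(site_mutual_information("KKKK", "EEEE"))  # site fully separates the sets
    print(site_mutual_information("KKEE", "KKEE"))  # site carries no set information
```

A site where the two sets use different residues scores high, while a site with the same residue mix in both sets scores zero, so ranking sites by this value yields candidate characteristic sites.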

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Oxford University Research Archive

ScholarBank@NUS

University of Queensland eSpace